R

R is an interpreted language designed for effective vector and matrix calculations in scientific work. (Scientists are about the only people who like arrays, data frames and so on.)
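To make the vector and matrix part concrete, here is a minimal sketch using only base R. The data and the names `heights_cm`, `heights_m` and `m` are made up for illustration.

```r
# Arithmetic applies to every element of a vector, no loop needed
heights_cm = c(150, 165, 180)   # hypothetical example data
heights_m = heights_cm / 100    # 1.50 1.65 1.80

# A 2x2 matrix, filled column by column
m = matrix(c(1, 2, 3, 4), nrow = 2)

# Matrix product (%*%) versus element-wise product (*)
matrix_product = m %*% m        # rows: (7, 15) and (10, 22)
elementwise_product = m * m     # rows: (1, 9) and (4, 16)
```

The point of the sketch is that `*` works element by element, while `%*%` is the matrix product from linear algebra.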

R is amazing because:

- It's free
- It's multi platform
- It's actively developed
- It has a wide community of package developers
- It has a command line, because it is an interpreted language
- It can draw beautiful graphs
- It's easy
- Soooo much more

Let's have a look at what R can do:

Plotting bus data

https://gallery.shinyapps.io/086-bus-dashboard/ (definitely questionable as far as usefulness goes, but what the heck)

Examples of simple R scripts


In [7]:
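library(ggplot2)   # plotting package, also provides the mpg data set
head(mpg)          # function: show the first six rows
summary(mpg)       # function: summarise every column
g = ggplot(mpg, aes(class))
# Number of cars in each class:
g + geom_bar()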


manufacturer  model  displ  year  cyl  trans       drv  cty  hwy  fl  class
audi          a4     1.8    1999  4    auto(l5)    f    18   29   p   compact
audi          a4     1.8    1999  4    manual(m5)  f    21   29   p   compact
audi          a4     2.0    2008  4    manual(m6)  f    20   31   p   compact
audi          a4     2.0    2008  4    auto(av)    f    21   30   p   compact
audi          a4     2.8    1999  6    auto(l5)    f    16   26   p   compact
audi          a4     2.8    1999  6    manual(m5)  f    18   26   p   compact
 manufacturer          model               displ            year     
 Length:234         Length:234         Min.   :1.600   Min.   :1999  
 Class :character   Class :character   1st Qu.:2.400   1st Qu.:1999  
 Mode  :character   Mode  :character   Median :3.300   Median :2004  
                                       Mean   :3.472   Mean   :2004  
                                       3rd Qu.:4.600   3rd Qu.:2008  
                                       Max.   :7.000   Max.   :2008  
      cyl           trans               drv                 cty       
 Min.   :4.000   Length:234         Length:234         Min.   : 9.00  
 1st Qu.:4.000   Class :character   Class :character   1st Qu.:14.00  
 Median :6.000   Mode  :character   Mode  :character   Median :17.00  
 Mean   :5.889                                         Mean   :16.86  
 3rd Qu.:8.000                                         3rd Qu.:19.00  
 Max.   :8.000                                         Max.   :35.00  
      hwy             fl               class          
 Min.   :12.00   Length:234         Length:234        
 1st Qu.:18.00   Class :character   Class :character  
 Median :24.00   Mode  :character   Mode  :character  
 Mean   :23.44                                        
 3rd Qu.:27.00                                        
 Max.   :44.00                                        

Let's have a look at some statistics


In [8]:
mpg = ggplot2::mpg
# fit a simple one-way ANOVA model: city mileage (cty) by manufacturer
cty_man_aov = aov(cty ~ manufacturer, mpg)
summary(cty_man_aov)


              Df Sum Sq Mean Sq F value Pr(>F)    
manufacturer  14   2348  167.70   19.61 <2e-16 ***
Residuals    219   1873    8.55                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Looks like there is something going on: the F test is highly significant (p < 2e-16), so mean city mileage differs between manufacturers. Let's plot it.


In [9]:
g = ggplot(mpg, aes(manufacturer, cty))
g + geom_boxplot()


Our job is done, paper published.

Basic stuff

There are variables:


In [ ]:
number = 5
string = "Hello World!"

There are simple math problems


In [ ]:
5 * 5
5 * number
10 %% 7
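A note on the last line of the cell above: in R the remainder (modulo) operator is written `%%`, not a single `%`. A small sketch of a few related base R operators follows. The variable names are only there for illustration.

```r
remainder = 10 %% 7    # remainder after division: 3
quotient = 10 %/% 7    # integer division: 1
power = 2 ^ 3          # exponentiation: 8
```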

There are simple comparisons


In [ ]:
5 == 1
5 >= number
"Hello World!" != string
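Comparisons also work on whole vectors at once, which is where they become really useful. The sketch below uses made-up data, and the names `values` and `is_big` are hypothetical.

```r
values = c(3, 5, 8)

# One TRUE/FALSE per element
is_big = values >= 5        # FALSE TRUE TRUE

# A logical vector can pick out elements or be counted
big_values = values[is_big] # 5 8
n_big = sum(is_big)         # 2, because TRUE counts as 1
```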

And there are functions


In [18]:
vector = c(1:10)
sum(vector)
max(vector)
paste(vector, collapse = "-")


55
10
'1-2-3-4-5-6-7-8-9-10'
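You are not limited to built-in functions. You can write your own with `function`. A minimal sketch, where `rounded_mean` is a hypothetical helper invented for this example and not part of R:

```r
# Hypothetical helper: mean of the values, rounded to `digits` decimal places
rounded_mean = function(values, digits = 1) {
  round(mean(values), digits)
}

result_default = rounded_mean(c(1:10))                 # 5.5
result_two = rounded_mean(c(1, 2, 2), digits = 2)      # 1.67
```

The last expression in the function body is the return value, so no explicit `return()` is needed here.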

There are graphs


In [23]:
hist(iris$Sepal.Length)


There are comments


In [24]:
number = 5
# I will not do anything that is written like this
# number = 6
print(number)


[1] 5

There is embedded help


In [26]:
?max

Now you know everything!